%File: formatting-instruction.tex
\documentclass[letterpaper]{article}
\usepackage{aaai}
\usepackage{times}
\usepackage{helvet}
\usepackage{courier}
\usepackage{url,color,graphicx}
\usepackage{tikz,rotating}
\usetikzlibrary{shapes,shadows,arrows,positioning}
\frenchspacing
\pdfinfo{
/Title (Nehovah)
/Subject (Computational Creativity)
/Author (Michael R. Smith and Ryan S. Hintze and Dan Ventura)}
\setcounter{secnumdepth}{0}  
 \begin{document}

\tikzstyle{decision} = [diamond, draw, fill=blue!50]
\tikzstyle{line} = [draw, -stealth, thick]
\tikzstyle{elli}=[draw, ellipse, fill=blue!50,minimum height=8mm, text width=5em, text centered]
\tikzstyle{block} = [draw, rectangle, fill=blue!50, text width=8em, text centered, minimum height=15mm, node distance=5em]

% The file aaai.sty is the style file for AAAI Press 
% proceedings, working notes, and technical reports.
%
\title{Nehovah: Creativity in Generating Neologisms}
\author{Michael R. Smith \and Ryan S. Hintze \and Dan Ventura\\
Department of Computer Science\\
Brigham Young University\\
Provo, UT 84602\\
}
\maketitle
\begin{abstract}
\begin{quote}
% Language is a rich medium in which one can express his or her ideas, emotions, and creativity.
% Human language is also ever changing to describe new concepts and ideas.
In this paper, we describe a computationally creative system (\textit{Nehovah}) that generates neologisms from a set of base words provided by a user.
We focus on the quality and creativity of the generated neologisms by taking into account several of their aspects, such as how apparent the original concepts are in the neologisms and how ``catchy'' the neologisms are.
We also include references to pop culture that change over time.
Nehovah allows the user to specify which aspects of the neologism are the most important.
As such, Nehovah is ideal for creative assistance.
%Think how to align this with creativity
\end{quote}
\end{abstract}

% RYAN for your changes uses this:
% \textcolor{blue}{Put changes in here}

\section{Introduction} 
%Argue how generating new words is a form a creativity. Maybe look at some Veale stuff and see how he argues it
Human language is highly dynamic and changes as times and preferences change.
Novel words, or neologisms, are created to describe new ideas or concepts, or simply to express creativity and set oneself apart from others.
The creation of neologisms is often essential for effective communication.
This is seen in a limited setting as a young child develops their vocabulary.
For example, a young child may use the word ``map-ball'' to describe a globe.
Regardless of the motivation, generating neologisms is a creative act that has been and will be endlessly repeated throughout human existence.
In this paper, we examine computational creativity through the generation of neologisms.

Creativity in itself is difficult to define, yet some products or people are often labelled as being creative or not.
Boden \cite{Boden1994} made one of the first attempts to formalize the notion of creativity.
Based on her formalization, computational creativity is often thought of as an exploration of a conceptual space and has been examined in a number of different areas such as generating paintings \cite{Colton2008,norton2011iccc} and melodies \cite{Cope2005}.
In natural language, computational creativity has been explored in areas such as poem generation \cite{Rahman2011}, metaphor generation \cite{Veale2007}, and sentence generation \cite{Mendes2004}.

Neologisms have previously been examined computationally.
The \textit{Zeitgeist} system \cite{Veale2006} examines finding the meaning in neologisms that were formed by blending two source words.
Zeitgeist is intended as a tool to enrich natural language processing resources such as WordNet with modern words that are often found in everyday speech.
To do so, Zeitgeist utilizes Wikipedia (\url{www.wikipedia.org}) to identify neologisms and determine their source words.
The concepts are identified using ideas from concept blending \cite{Veale2000} where differing concepts are blended together by identifying what the concepts have in common.
Cook and Stevenson \cite{Cook2010} propose to find the meaning in neologisms using a statistical model that draws on observed linguistic properties of blends.
The linguistic properties are primarily based on how recognizable the source words are in a blend.
Thus, how apparent the concepts are in the neologism is an important aspect of the quality and creativity of a neologism.
Duch \cite{Duch2007} created neologisms using a neurocognitive model inspired by the processes of the brain.
Duch focuses on mimicking how the brain functions.
One of the shortfalls of his approach is that the neologisms often contain little to no information about the concepts used to generate them, except for the pool of candidate letters used to create the neologisms.

Our system for generating neologisms, \textit{Nehovah}, utilizes ideas from the previous work.
Nehovah focuses on generating neologisms that are blends and seeks to preserve the concepts that are given in the source words (as opposed to generating neologisms that represent entirely new ideas by themselves, e.g., words such as ``Google'').
Nehovah also takes into account information from social media to incorporate a dynamic source of pop culture into the neologisms. %as well as base words from a data base (WordNet \cite{wordnet}).
Nehovah also provides an interface allowing the user to specify which of several aspects of a neologism are the most important to them.
This gives Nehovah the flexibility to be used either for creative assistance or as an autonomous system.
\begin{figure*}[thp]
 \begin{center}
\begin{tikzpicture}[
nonterminal/.style={
% The shape:
rectangle,
% The size:
minimum size=6mm,
rounded corners=3mm,
% The border:
very thick,
draw=blue!50!black!50,
%
%50% red and 50% black,
%and that mixed with 50% white
% The filling:
top color=white,
% a shading that is white at the top...
bottom color=blue!50!black!20, % and something else at the bottom
% Font
font=\itshape
}]
\node (word1)  [nonterminal,draw=white!50!black!50,bottom color=white!50!black!20,text width=11mm] {\begin{center}Source Word 1\end{center}};
\node (word2) [nonterminal,below=of word1,draw=white!50!black!50,bottom color=white!50!black!20,text width=11mm] {\begin{center}Source Word 2\end{center}};
\node (synonym1) [nonterminal,right=of word1,text width=22mm,xshift=-3mm] {\begin{center}Find Synonyms\begin{itemize}
                                                                                                      \item WordNet
\item TheTopTens
                                                                                                     \end{itemize}\end{center}};
\node (synonym2) [nonterminal,right=of word2,text width=22mm,xshift=-3mm] {\begin{center}Find Synonyms\begin{itemize}
                                                                                                      \item WordNet
\item TheTopTens
                                                                                                     \end{itemize}\end{center}};
\node (blended1) [nonterminal,right = of synonym1, text width=10mm,draw=white!50!black!50,bottom color=white!50!black!20,xshift=-3mm] {\begin{center}Syn 1\\Syn 2\\Syn 3\\...\end{center}};
\node (blended2) [nonterminal,right = of synonym2, text width=10mm,draw=white!50!black!50,bottom color=white!50!black!20,xshift=-3mm] {\begin{center}Syn 1\\Syn 2\\Syn 3\\...\end{center}};
\node (blending) [nonterminal,right=of blended1,text width = 26mm,minimum height=20mm,yshift=-8mm,xshift=-3mm] {\begin{center}Blend Words
\begin{itemize}
                                                                                                                                \item Partition words by syllables
\item Combine prefixes and suffixes
                                                                                                                               \end{itemize}\end{center}};
\node (neos) [nonterminal,right=of blending,text width=10mm,draw=white!50!black!50,bottom color=white!50!black!20,xshift=-3mm] {\begin{center}Neo 1\\Neo 2\\Neo 3\\...\end{center}};
\node (score) [nonterminal,right=of neos, text width=20mm,xshift=-3mm] {\begin{center}Scoring\begin{itemize}
                                                                                                      \item Word structure
\item Syllable exchange
\item Concepts
\item Uniqueness
\item Pop culture
                                                                                                     \end{itemize}\end{center}};
\node (Finalneos) [nonterminal,right=of score,text width=10mm,draw=white!50!black!50,bottom color=white!50!black!20] {\begin{center}\underline{Top N}\\Neo 1\\Neo 2\\Neo 3\\...\end{center}};

\path (word1) edge[thick,->] (synonym1);
\path (word2) edge[thick,->] (synonym2);
\path (synonym1) edge[thick,->] (blended1);
\path (synonym2) edge[thick,->] (blended2);
\path (blended1) edge[thick,->] (blending);
\path (blended2) edge[thick,->] (blending);
\path (blending) edge[thick,->] (neos);
\path (neos) edge[thick,->] (score);
\path (score) edge[thick,->] (Finalneos);

\end{tikzpicture}
 \end{center}
\caption{An overview of how Nehovah generates neologisms through finding synonyms, blending words, and scoring.}
\label{figure:overview}
\end{figure*}

We use Nehovah to examine computational creativity in the generation of neologisms.
Before we can declare whether Nehovah is creative, we first need to characterize what we mean by creative.
Boden described creativity as ``the generation of ideas that are both novel and valuable.''
Thus, our goal is for Nehovah to generate novel and valuable neologisms.
It is relatively easy for a computer to combine letters and create novel words.
Having a computational system determine which neologisms are valuable, however, is a more difficult problem.
The value of a neologism could be expressed in a number of different terms, such as whether it is easy to say, whether the underlying concepts are apparent, or whether it is funny.
We examine a number of different measures to evaluate the quality of neologisms, where each measure returns a different set of neologisms based on what the user deems as valuable.
Nehovah also introduces novelty by seeking diverse words to blend.

In the next section, we briefly describe the main components of Nehovah and how they work to create and evaluate neologisms.
We then present several examples of neologisms generated by Nehovah.
We conclude with a discussion of Nehovah and directions for future work.


% maybe some examples in other areas (briefly mention)
% Focus on Tony Veale and Duch. Mendes with Dupond find similar words. Duch looks at how the brain works. Veale looks at mashed up words via Wikipedia. Paul cook, Automatically Identifying the Source Words of Lexical Blends in English
% We differ from them because...

% Creativity in Language (other languages)
% new words are a way of creative expression

%\section{Related Works}
% Concept Blending
% Tony Veale
% Duch guy
% Autistic kids study - ability to communicate effectively
% 

\section{Nehovah}
Nehovah generates neologisms by blending the concepts from two source words provided by a user.
%combining the concepts from a set of words provided by a user.
In its most basic form, Nehovah implements three major steps: finding synonyms, blending words, and scoring neologisms.
In the first step, finding synonyms, Nehovah utilizes several resources to find words that are similar in meaning to the provided words.
Next, the words are broken up based on where syllables may occur and the word fragments are then pieced together with fragments from other words.
Lastly, the pieced-together words (neologisms) are scored.
The three steps are shown in Figure \ref{figure:overview} and are described in more detail in the following sections.

\subsection{Finding Synonyms} 
The purpose of finding synonyms is to produce more interesting words that still convey the same concept.
Using synonyms allows Nehovah to enrich the novelty of the neologisms.
Rather than only using the words that are most commonly associated with a source word, Nehovah seeks a diverse set of synonyms.
For example, ``God'' is a more diverse synonym for creator than ``maker''.
For each source word, Nehovah first utilizes WordNet \cite{wordnet} (a lexical database of words) as a means of finding words with similar meaning.
Nehovah adds words from WordNet that have a synonymous relationship with the base word across all parts of speech (noun, verb, adjective, and adverb).
One shortcoming of WordNet, however, is that it is somewhat dated and it does not contain current pop culture references.

To increase the novelty of the neologisms, Nehovah queries TheTopTens (\url{www.thetoptens.com}) to add synonymous pop culture references for the source words.
TheTopTens is a website that allows users to create top ten lists on various topics and vote on them.
As an example, if ``car'' is provided as a base word, Nehovah queries TheTopTens for all lists that have ``car'' in their title.
Several lists will be retrieved such as ``Top Ten Best Car Companies,'' ``Best Car Brands,'' and ``Best Auto Insurance Companies.''
One problem faced with using such a resource is adding irrelevant words as synonyms.
To avoid adding irrelevant words, Nehovah first determines if a list is relevant based on its title.
The words from the ``Top Ten Best Car Companies'' list are more relevant than the words from the ``Greatest Songs by the Cars'' list.
Nehovah determines if a list is relevant by finding the words in the title that are descriptive or plural.
If the descriptive word directly precedes the base word (``car'' in this example), then that list is deemed relevant.
For example, the list ``Top Ten Best Car Companies'' would be accepted since the descriptive word ``best'' is describing the base word car.
Nehovah also looks at which words are in the plural form.
In the list ``Greatest Songs by the Cars,'' there are two plural words: ``Songs'' and ``Cars.''
The list is determined to be about songs rather than cars since ``Songs'' appears before ``Cars.''
Nehovah also accepts lists that have the base word directly before the first plural word such as ``Top Ten Car Movies'' assuming the base word is being used as a descriptor for the plural word.

Another problem with user-defined lists is that some list items are more descriptive than others.
For example, the ``Best Muscle Cars'' list may contain items such as ``1961 Ford GT Mustang From Gone in 60 Seconds.''
While this information is beneficial for determining why an item made the list, it is difficult to use to generate neologisms.
Nehovah parses the list items so that the extra descriptive information is not included.
Further, to filter out obscure (and/or misspelled) words and references, Nehovah only keeps list items that are found in WordNet.
 
\subsection{Blending Words} 
\label{section:combine}
With a set of synonymous words for each source word, Nehovah next blends words from different word sets to create a set of neologisms.
%The sets of synonymous words (one set of synonymous words for each source word) could be partitioned on every letter.
The words in each set are partitioned on their syllables to aid in retaining their concepts and to eliminate invalid words.
%However, one aspect of creativity is distinguishing between valuable and non-valuable artifacts.
%To address this, Nehovah aims at partitioning the words on their syllables.
Neologisms can appear awkward if words are split up without finishing the syllable.
Thus, partitioning words on syllables also reduces the number of candidate neologisms that Nehovah has to score.

Nehovah first partitions each word into a set of prefixes and a set of suffixes by dividing each word on its potential syllables.
We say ``potential syllables'' because it is nontrivial to determine algorithmically where the syllables are.
This is because syllable boundaries depend on the pronunciation of a word (e.g., ``io'' forms two separate vowel sounds in ``ion'' but a single vowel sound in ``motion'').
This information is not encoded in the spelling of the word.
To account for this, Nehovah conservatively splits a word after every vowel or, when the vowel is followed by consecutive consonants, between those consonants (with the exception of ``sh,'' ``th,'' and ``ch'').
At least one vowel has to be present in every word chunk.
Figure \ref{figure:syllables} shows two examples of how Nehovah breaks words into syllables to form prefixes and suffixes.
Nehovah combines the prefixes from the set of synonyms from one source word with the set of suffixes from the set of synonyms for the other source word.
Using letter counts from the words found in WordNet, Nehovah verifies that the letter sequence spanning the end of the prefix and the start of the suffix exists, and discards any neologism with a letter combination not found in WordNet.
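As an illustration, consider blending ``neologism'' with ``jehovah,'' assuming that ``jehovah'' is partitioned as je$|$ho$|$vah (this partition is our reading of the splitting rule above, not recorded system output).
Leaving the whole words aside, the prefixes of ``neologism'' in Figure \ref{figure:syllables} are ne, neo, and neolo, and the suffixes of ``jehovah'' under this partition are hovah and vah, which gives six candidate blends:
\begin{center}
\begin{tabular}{l|ll}
Prefix & + hovah & + vah \\
\hline
ne & nehovah & nevah \\
neo & neohovah & neovah \\
neolo & neolohovah & neolovah \\
\end{tabular}
\end{center}
Each candidate is kept only if the letter sequence at the junction passes the WordNet check described above.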
% TODO FIND AN EXAMPLE
% Having information about the number of syllables is important for scoring as well since many neologisms replace mash two words together and maintain the original number of syllables in one of the words (this will be discussed in more detail the scoring section).

\begin{figure}[tp]
%\centerline{\includegraphics[width=2.5in]{../../ensemble/Fig1.eps}}
\begin{center}
\begin{tabular}{ll|ll}
\multicolumn{2}{c|}{compartment} & \multicolumn{2}{c}{neologism}\\
\hline
\underline{Prefixes:} & \underline{Suffixes:} & \underline{Prefixes:} & \underline{Suffixes:} \\
compartment & compartment & neologism & neologism\\
compart & ment & neolo & gism\\
compar & tment & neo & logism\\
com & partment & ne & ologism\\
\end{tabular}
\end{center}
\caption{Examples of how Nehovah breaks words into syllables, splitting between consecutive consonants and after vowels.}
\label{figure:syllables}
\end{figure}

\subsection{Scoring}
\label{section:scoring}
Once Nehovah creates a set of neologisms, the challenging part (as with most creative systems) is determining which neologisms are the most valuable.
Evaluation is an important part of the creative process and something at which humans excel.
However, it is difficult to encode the innate knowledge and preferences of humans in order to evaluate artifacts algorithmically.
% Computers, on the other hand, do well at creating new artifacts/exploring the domain, but do not evaluate the artifacts as well as a human.
%The evaluation of generated artifacts is an important phase of creativity in determining which artefacts are valuable.
Nehovah evaluates a neologism using five scoring criteria.% (word structure, syllable exchange, concepts, uniqueness, and pop culture).
The neologisms are scored in this manner so that a user can specify which types of neologisms are desired, and, hence, are scored higher.
% For example, a user could specify that they want neologisms that incorporate pop culture and unique words when Nehovah generates a set of neologisms.
% The same set of neologisms would generate different scoring if a user wanted neologisms that convey the concept and maintain the same number of syllables.
Users can try Nehovah at \url{http://axon.cs.byu.edu/~nehovah}.
% \begin{description}
\subsubsection{Word Structure.} Word structure is partially addressed during blending.
By breaking the words down into syllables, Nehovah is able to make a better choice of candidate neologisms.
Nehovah also uses word statistics from the words in WordNet and removes any neologisms that do not have proper letter sequences.
\subsubsection{Syllable Exchange.} Syllable exchange recognizes that neologisms that replace part of a word with the same number of syllables tend to be catchier and to convey meaning better.
For example, ``ginormous'' is a combination of ``giant'' and ``enormous,'' formed by replacing the first syllable of ``enormous'' with the first syllable of ``giant.''
Enough of ``enormous'' is left that the meaning is still apparent.
Another example is ``Linsanity,'' which replaces the first syllable in ``insanity'' with the single-syllable word ``Lin'' (the last name of a professional basketball player).
This example illustrates that using overlapping letter sequences common to both of the original words can also aid in creating more valuable neologisms.
\subsubsection{Concepts.} One of the primary goals of Nehovah is to convey the concepts of the source words in the neologism.
Nehovah measures how apparent the concepts are in a neologism by first scoring how apparent the concepts are in the prefixes and suffixes.
For the words from WordNet, the concept is measured using MoreWords (\url{www.morewords.com}), a tool for crossword puzzles and other word games.
MoreWords provides information for prefixes and suffixes regarding the number of times a word with a given prefix or suffix occurs per million words.
Nehovah determines how apparent the concept is in the prefix by comparing the frequency of the word the prefix is derived from with the frequencies of other words beginning with the same prefix.
The score is computed by first calculating the ratio $\phi$:
$$
\phi(w) = \frac{FPM(w)}{\sum_{x \in W}FPM(x)}
$$
where $FPM(w)$ is the frequency per million words of $w$ reported by MoreWords and $W$ is the set of words that begin with the same prefix (or end with the same suffix) as $w$.
$concept(w)$ is then calculated using linear interpolation between empirically determined values based on the value of $\phi(w)$, as shown in Figure \ref{figure:conceptScoring}.
If $\phi(w) > 0.1$, then $concept(w) = 1.0$.
If $\phi(w) < 0.01$, then $concept(w)$ is linearly interpolated between 0 and 0.8.
If $\phi(w)$ is between 0.01 and 0.1, then $concept(w)$ is linearly interpolated between 0.8 and 1.
% This graph uses two thresholds which were empirically determined.
This method differentiates between word prefixes and suffixes that do not convey the concept, those that partially convey the concept, and those that completely convey the concept.
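Written out explicitly, and assuming that the interpolation is linear in $\phi(w)$ itself and starts from $concept(w) = 0$ at $\phi(w) = 0$ (the segment endpoints follow from the thresholds above; the exact shape of each segment is an assumption made for this illustration), the mapping is
\[
concept(w) = \left\{
\begin{array}{ll}
80\,\phi(w) & \phi(w) < 0.01\\
0.8 + 0.2\,\frac{\phi(w) - 0.01}{0.09} & 0.01 \leq \phi(w) \leq 0.1\\
1.0 & \phi(w) > 0.1
\end{array}\right.
\]
For example, with hypothetical frequencies in which a word accounts for 5.5 of the 100 occurrences per million of all words sharing its prefix, $\phi(w) = 5.5/100 = 0.055$ and $concept(w) = 0.8 + 0.2 \cdot 0.045/0.09 = 0.9$.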

\begin{figure}[tp]
\begin{center}
\input{conceptScoring.tex}
\end{center}
\caption{A graph of the scoring method for how apparent the concept is in the prefixes and suffixes of a word.}
\label{figure:conceptScoring}
\end{figure}
%$\phi$ is then normalized by using two arbitrary thresholds, $\tau_{HIGH}$ and $\tau_{LOW}$, according to:}
%\[
%concept(w) = \left\{
%	\begin{array}{l l}
%	1.0 & \quad \phi(w) > \tau_{HIGH}\\
%	T   \dots 1.0 & \quad \tau_{LOW} \leq \phi(w) \leq \tau_{HIGH}\\
%	0.0 \dots T & \quad \phi(w) < \tau_{LOW}\\
%	\end{array}\right
%\]


%\textcolor{blue}{where $T$ represents an arbitrary intermediate value to normalize with and $0.0 < T < 1.0$.
%$concept(w)$ is linearly interpolated between the boundaries in which it resides based on $\phi(w)$.
%This method allows a cutoff for words that do not convey the concept ($\phi < \tau_{LOW}$), helps differentiate between words that partially convey the concept ($\tau_{LOW} \leq \phi \leq \tau_{HIGH}$) and accepts words in which the concept is apparent ($\phi > \tau_{HIGH}$).
%The values used by Nehovah are $\tau_{HIGH} = 0.1$, $\tau_{LOW} = 0.01$, and $T = 0.8$ which were resolved expirementally.

%TODO Ryan put in some details about how the concept is scored


For pop culture, in addition to using MoreWords, Nehovah also counts the number of times that a word appears in the lists from TheTopTens.
Nehovah then queries Wikipedia (\url{www.wikipedia.org}) for the pop culture word and counts the number of links that refer to the source word.
The pop culture concept score is a combination of the counts from TheTopTens and Wikipedia.
This score indicates how strongly the pop culture reference is associated with the source word.
\subsubsection{Uniqueness.} The uniqueness score places more value on words that are not commonly used, but still convey the same concept.
For example, for the word ``pants,'' ``trousers'' is more common than ``bloomers,'' although both convey the same concept.
Uniqueness is calculated using the frequency per million words from MoreWords.
The uniqueness score is taken relative to all of the other synonymous words in the set and is calculated as:
$$
unique(w) = 1 - \frac{FPM(w)}{\max_{x \in W}FPM(x)}.
$$
Here, $W$ is the set of synonymous words for the source word.
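For example, with hypothetical frequencies of 10 occurrences per million words for ``trousers'' and 1 per million for ``bloomers'' (illustrative values, not taken from MoreWords), and taking the denominator to be the largest frequency in the synonym set,
\[
\begin{array}{l}
unique(\mbox{trousers}) = 1 - \frac{10}{10} = 0,\\[1mm]
unique(\mbox{bloomers}) = 1 - \frac{1}{10} = 0.9,
\end{array}
\]
so the rarer synonym receives the higher uniqueness score.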
% where $FPM(w)$ represents the frequency per million words from MoreWords.
\subsubsection{Pop Culture.} The pop culture score indicates if one or both of the base words are pop culture words.
This allows the user to place more value on pop culture references.
% \end{description}
 
\section{Results} 

In this section we present results from Nehovah to examine its creativity.
The name of our system (Nehovah) is a result of providing Nehovah with the source words ``neologism'' and ``creator.''
Nehovah is a mix of the words ``neologism'' and ``Jehovah.''
It is readily apparent that Nehovah incorporates the word ``Jehovah'' since only one letter was altered.
Another candidate neologism was ``Neohovah,'' which conveys more of the meaning of the word ``neologism'' but does not flow as well since an additional syllable is added.
The results for ``neologism'' and ``creator'' as source words with all of the scoring criteria equally weighted are shown in Figure \ref{figure:neologismCreator}.
Figure \ref{figure:neologismCreator-scored} shows the results for adjusting the scoring to maintain the number of syllables (left columns) and to preserve the underlying concept (right columns).
Parentheses indicate possible endings for a word.
For example, ``jeolog (-ism -y)'' could be either jeologism or jeology.

\begin{figure}[tp]
%\centerline{\includegraphics[width=2.5in]{../../ensemble/Fig1.eps}}
\begin{center}
\begin{tabular}{l|ll}
Neologism & \multicolumn{2}{c}{Base Words}\\
\hline
inventivine & invention & divine \\
indivention & individual & invention \\
almigvention & almighty & invention \\
almighvention & almighty & invention \\
coinaker & coinage & maker \\
coinator & coinage & creator \\
almightion & almighty & invention \\
divention & divine & invention \\
jehovention & jehovah & invention \\
coinavine & coinage & divine \\
inventidual & invention & individual \\
inventividual & invention & individual \\
indivinage & individual & coinage \\
divinage & divine & coinage \\
indivion & individual & invention \\
coinadivine & coinage & divine \\
coinagod (-head) & coinage & god (-head) \\
almiginvention & almighty & invention \\
coinaperson & coinage & person \\
coinacreator & coinage & creator \\
\end{tabular}
\end{center}
\caption{Top 20 neologisms from Nehovah combining ``neologism'' and ``creator'' with all factors equally weighted.}
\label{figure:neologismCreator}
\end{figure}

\begin{figure*}[th]
%\centerline{\includegraphics[width=2.5in]{../../ensemble/Fig1.eps}}
\begin{center}
\begin{tabular}{l|ll||l|ll}
\multicolumn{3}{c||}{Weighted for Syllable Exchange} & \multicolumn{3}{c}{Weighted for Concepts}\\
\hline
&&&&\\
Neologism & \multicolumn{2}{c||}{Base Words} & Neologism & \multicolumn{2}{c}{Base Words}\\
\hline
divinage & divine & coinage & almiginvention & almighty & invention \\
jeolog (-ism -y) & jehovah & neolog (-ism -y) & coinadivine & coinage & divine \\
nehovah & neolog & jehovah & coinacreator & coinage & creator \\
coinator & coinage & creator & coinagod (-head) & coinage & god (-head) \\
covah & coinage & jehovah & coinaalmighty & coinage & almighty \\
inventividual & invention & individual & almigvention & almighty & invention \\
inventivine & invention & divine & almighvention & almighty & invention \\
inventor & invention & creator & almighinvention & almighty & invention \\
inventy & invention & almighty & coinaperson & coinage & person \\
coidual & coinage & individual & coinasoul & coinage & soul \\
divion & divine & invention & almigcoinage & almighty & coinage \\
coividual & coinage & individual & almighcoinage & almighty & coinage \\
jeholog (-ism -y) & jehovah & neolog (-ism -y) & coinasome (-body -one) & coinage & some (-body -one) \\
neovah & neolog & jehovah & almighword & almighty & word \\
neoul & neolog & soul & almigword & almighty & word \\
coivine & coinage & divine & coinalord & coinage & lord \\
invidual & invention & individual & inventisoul & invention & soul \\
invine & invention & divine & inventiperson & invention & person \\
indivion & individual & invention & inventialmighty & invention & almighty \\
coinaker & coinage & maker & inventigod (-head) & invention & god (-head) \\
\end{tabular}
\end{center}
\caption{Top 20 neologisms from Nehovah combining ``neologism'' and ``creator,'' weighted for syllable exchange (left) and for how prevalent the concepts are (right).}
\label{figure:neologismCreator-scored}
\end{figure*}

There were no pop culture references for ``neologism'' or ``creator.''
To illustrate an example with pop culture, we use the source words ``evil'' and ``school.''
Evil has pop culture references such as ``Lord Voldemort'' and ``Hitler,'' while school generates pop culture references such as ``history'' and ``pencil.''
With all scoring aspects equally weighted, the top 20 results for ``evil'' and ``school'' are shown in Figure \ref{figure:evilSchool}.
For each neologism, the blended base words are listed, and the line below each pop culture base word gives the list from TheTopTens from which it was taken.
Words in parentheses give extra information about the item from TheTopTens.
Base words with no list on the line below them are taken from WordNet.
The neologism with the highest score is ``iniquilding,'' which blends the words ``iniquity'' (from ``evil'') and ``building'' (from ``school'').
Our favorites are ``Megatronomy'' (blending Megatron from the movie ``Transformers'' and astronomy) and ``vicionomics'' (blending vicious and economics).
This also shows a bias in Nehovah toward words that have overlapping letter sequences (e.g., Megatron and astronomy overlap on the letters ``tron''), on the assumption that the overlap aids both the blending of the words and the pronunciation of the neologism.
% TODO Specify that creativity is sometimes enhanced by knowing what the base words are. Example, paintings are enhanced by knowing the source emotion.

Sometimes the concepts from the base words in a neologism are not apparent until the base words are known.
When this happens, the neologism is sometimes suddenly considered clever, humorous, or both.
The perception of creativity in a neologism can be enhanced by knowing what the base words are.
This phenomenon is similar to how the creativity of paintings is more appreciated when the source emotion is known.

\begin{figure*}[tp]
%\centerline{\includegraphics[width=2.5in]{../../ensemble/Fig1.eps}}
\begin{center}
\begin{tabular}{l|ll}
Neologism & \multicolumn{2}{c}{Base Words} \\
\hline
\hline
iniquilding & iniquity & building \\
\hline
napoleometry & napoleon & geometry \\
 & - Top Ten Most Evil People in History & - Top Ten Hardest School Subjects \\
\hline
jackpack & jack\_the\_ripper & backpack \\
 & - Top Ten Most Evil People in History & - Top Ten Most Needed School Supplies \\
\hline
german\_General & german & German\_General (Third on a Match) \\
& - Top Ten Best School Subjects & - The Most Evil Movie Villains of All Time \\
\hline
germann\_goering & german & hermann\_goerings \\
& - Top Ten Best School Subjects & - Top 10 Most Evil Nazis \\
\hline
napoleriod & napoleon & period \\
 & - Top Ten Most Evil People in History & \\
\hline
louistronomy & louis\_xvi & astronomy \\
 & - Top Ten Most Evil Leaders in History & - Top Ten Best School Subjects \\
\hline
iniquitebooks & iniquity & notebooks \\
 &  & - Top Ten Most Needed School Supplies \\
\hline
naponomics & napoleon & economics \\
 & - Top Ten Most Evil People in History & - Top Ten Best School Subjects \\
\hline
napolemistry & napoleon & chemistry \\
& - Top Ten Most Evil People in History & - Top Ten Hardest School Subjects \\
\hline
napotebooks & napoleon & notebooks \\
 & - Top Ten Most Evil People in History & - Top Ten Most Needed School Supplies \\
\hline
napoleography & napoleon & geography \\
 & - Top Ten Most Evil People in History & - Top Ten Best School Subjects \\
\hline
edificent & edifice & Maleficent (Sleeping Beauty) \\
 &  & - The Most Evil Movie Villains of All Time\\
\hline
Megatronomy & Megatron (Transformers) & astronomy\\
 & - The Most Evil Movie Villains of All Time & - Top Ten Best School Subjects \\
\hline
edificer\_Teasle & edifice & Officer\_Teasle (Rambo First Blood) \\
 & & - The Most Evil Movie Villains of All Time\\
\hline
edificer\_Tenpenny & edifice & Officer\_Tenpenny (GTA) \\
& & - The 10 Most Evil Villains In Video Games \\
\hline
informapoleon & information\_technology & napoleon \\
& - Top Ten Best School Subjects & - Top Ten Most Evil People in History \\
\hline
napolealth & napoleon & health \\
 & - Top Ten Most Evil People in History & - Top Ten Best School Subjects \\
\hline
geograpoleon & geography & napoleon \\
 & - Top Ten Best School Subjects & - Top Ten Most Evil People in History \\
\hline
immorayons & immorality & crayons \\
 & & - Top Ten Most Needed School Supplies \\
\hline
viciotebooks & vicious & notebooks \\
 & & - Top Ten Most Needed School Supplies \\
\hline
vicionomics & vicious & economics \\
 & & - Top Ten Best School Subjects \\
\end{tabular}
\end{center}
\caption{Top 20 neologisms from Nehovah combining ``evil'' and ``school'' with all factors equally weighted. The second line gives additional information about the source lists from TheTopTens where the pop culture references were obtained.}
\label{figure:evilSchool}
\end{figure*}


\section{Conclusions}
In this paper, we presented Nehovah, a computationally creative system that generates neologisms by blending words.
Nehovah aims at generating neologisms that are both novel and valuable, where what is valuable can be adjusted by a user.
The underlying value that Nehovah seeks to maintain is that the concepts are prevalent in the neologisms.
Creativity is expressed in all of the stages that Nehovah implements: finding synonyms, blending words, and scoring neologisms.
In the finding synonyms stage, creativity is expressed by finding diverse words that capture the concept of the original words.
Nehovah also incorporates pop culture to provide more diversity to the set of synonymous words.
When the words are blended, Nehovah creatively splits the words, avoiding splits where meaning may be lost or that lead to neologisms with awkward letter combinations.
Nehovah accomplishes this by attempting to split on the syllables of the words.
Thus, rather than randomly splitting the words, Nehovah guides the search of the neologisms.
With a candidate set of neologisms, Nehovah determines which neologisms are valuable by assigning them a score.
Determining if an artifact is valuable is the most challenging aspect of creativity.
Nehovah attacks this problem by scoring the neologisms on several different aspects (pop culture references, how apparent the concepts are, syllable exchange, and uniqueness).
This approach also allows Nehovah to be used as a creative assistance tool, since a user can specify which aspects of a neologism are the most important.
Nehovah is also able to adapt over time since it queries TheTopTens and Wikipedia.
The content of both of these websites is created by the general public and reflects the ideology of current times.

One capability that Nehovah lacks is the ability to learn from and adapt to user input.
This is often an important aspect in the creative process of determining what is valuable.
This process is commonly seen in engineering when developing a new product and testing the market demand as well as with artists that produce artifacts where the goal may be to get a reaction.
Currently, Nehovah encodes our intuitions of what constitutes a valuable neologism, but a learning mechanism would allow Nehovah to develop some of its own preferences and intuitions.


\bibliographystyle{aaai} 
\bibliography{nehovah}
\end{document}

